Abstract. The remarkable results of Foster and Vohra were a starting
point for a series of papers which show that any sequence of outcomes can
be learned (with no prior knowledge) using some universal randomized
forecasting algorithm and forecast-dependent checking rules. We show
that for the class of all computationally efficient outcome-forecast-based
checking rules, this property is violated. Moreover, we present a
probabilistic algorithm generating with probability close to one a sequence
with a subsequence which simultaneously miscalibrates all partially weakly
computable randomized forecasting algorithms.

According to Dawid's prequential framework we consider partial
recursive randomized algorithms.



1 Introduction 

Let a binary sequence $\omega_1, \omega_2, \dots, \omega_{n-1}$ of outcomes be
observed by a forecaster whose task is to give a probability $p_n$ of the
future event $\omega_n = 1$. The evaluation of probability forecasts is based
on a method called calibration: informally, following Dawid [1], a forecaster
is said to be well-calibrated if for any $p^*$ the event $\omega_n = 1$ holds
in $100p^*\%$ of the moments of time at which he chooses $p_n \approx p^*$
(see also [2]).

Let us give some notation. Let $\Omega$ be the set of all infinite binary
sequences, $\Xi$ be the set of all finite binary sequences and $\Lambda$ be
the empty sequence. For any finite or infinite sequence
$\omega = \omega_1 \dots \omega_n \dots$, we write
$\omega^n = \omega_1 \dots \omega_n$ (we put $\omega^0 = \Lambda$). Also,
$l(\omega^n) = n$ denotes the length of the sequence $\omega^n$. If $x$ is a
finite sequence and $\omega$ is a finite or infinite sequence then $x\omega$
denotes the concatenation of these sequences, and $x \sqsubseteq \omega$ means
that $x = \omega^n$ for some $n$.

In the measure-theoretic framework we expect that the forecaster has a
method for assigning probabilities $p_n$ of a future event $\omega_n = 1$ for
all possible finite sequences $\omega_1, \omega_2, \dots, \omega_{n-1}$. In
other words, all conditional probabilities

$$p_n = P(\omega_n = 1 \mid \omega_1, \omega_2, \dots, \omega_{n-1})$$

must be specified, and the overall probability distribution on the space
$\Omega$ of all infinite binary sequences will then be defined. But in
reality, we should recognize that we have only an individual sequence
$\omega_1, \omega_2, \dots, \omega_{n-1}$ of events and that the corresponding
forecasts $p_n$ whose testing is considered may fall short of defining a full
probability distribution on the whole space $\Omega$. This is the point of the
prequential principle proposed by Dawid [1]. This principle says that the
evaluation of a probability forecaster should depend only on his actual
probability forecasts and the corresponding outcomes. The additional
information contained in a probability measure that has these probability
forecasts as conditional probabilities should not enter into the evaluation.
According to Dawid's prequential framework we do not consider the numbers
$p_n$ as conditional probabilities generated by some overall probability
distribution defined for all possible events. Accordingly, a deterministic
forecasting system is a partial recursive function $f : \Xi \to [0, 1]$. We
suppose that a valid forecasting system $f$ is defined on all finite initial
fragments $\omega_1, \dots, \omega_{n-1}, \dots$ of the analyzed individual
sequence of outcomes.

The first examples of individual sequences for which well-calibrated
deterministic forecasting is impossible (non-calibrable sequences) were
presented by Oakes [6] (see also Schervish [9]). Unfortunately, the methods
used in these papers, and in Dawid [1], [2], do not comply with the
prequential principle; they depend on some mild assumptions about the measure
from which probability forecasts are derived as conditional probabilities.
The method of generating non-calibrable sequences with probability
arbitrarily close to one presented in V'yugin [11] is also based on the same
assumptions. In this paper we modify the construction from [11] for the case
of partial deterministic and randomized forecasting systems that do not
correspond to any overall probability distribution.

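Oakes [6] showed that any everywhere defined forecasting system $f$ is not
calibrated for the sequence $\omega = \omega_1 \omega_2 \dots$ defined by

$$\omega_i = \begin{cases} 1 & \text{if } p_i < 0.5, \\ 0 & \text{otherwise,} \end{cases}$$

where $p_i = f(\omega_1 \dots \omega_{i-1})$, $i = 1, 2, \dots$.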
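The diagonal construction of Oakes can be tried out numerically. The following
is a minimal sketch under stated assumptions: the function names and the
example forecaster `running_mean` are hypothetical and are not taken from [6];
any total deterministic forecaster could be plugged in instead.

```python
from typing import Callable, List, Tuple


def oakes_sequence(
    forecaster: Callable[[List[int]], float], n: int
) -> Tuple[List[int], List[float]]:
    """Generate n outcomes against a deterministic, everywhere defined forecaster."""
    outcomes: List[int] = []
    forecasts: List[float] = []
    for _ in range(n):
        p = forecaster(outcomes)               # p_i = f(w_1 ... w_{i-1})
        forecasts.append(p)
        outcomes.append(1 if p < 0.5 else 0)   # w_i = 1 iff p_i < 0.5
    return outcomes, forecasts


def running_mean(past: List[int]) -> float:
    """Hypothetical forecaster: empirical frequency of ones (0.5 with no data)."""
    return sum(past) / len(past) if past else 0.5


outcomes, forecasts = oakes_sequence(running_mean, 1000)

# Split the steps by forecast interval [0, 0.5) and [0.5, 1].
low = [(w, p) for w, p in zip(outcomes, forecasts) if p < 0.5]
high = [(w, p) for w, p in zip(outcomes, forecasts) if p >= 0.5]

# Calibration error restricted to each interval, normalized by n.
err_low = sum(w - p for w, p in low) / len(outcomes)
err_high = sum(w - p for w, p in high) / len(outcomes)
```

On the steps with $p_i < 0.5$ every outcome is 1 and on the remaining steps
every outcome is 0, so the two restricted errors cannot both be small: their
difference is at least 0.5 for any forecaster.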

Foster and Vohra [3] showed that well-calibrated forecasts are possible if
these forecasts are randomized. By a randomized forecasting system they mean a
random variable $f(\alpha; x)$ defined on some probability space $\Omega_x$
supplied with some probability distribution $\Pr_x$, where $x \in \Xi$ is a
parameter. As usual, we omit the argument $\alpha$. For any infinite $\omega$,
the probability distributions $\Pr_{\omega^{i-1}}$ generate the overall
probability distribution $\Pr$ on the direct product of the probability spaces
$\Omega_{\omega^{i-1}}$, $i = 1, 2, \dots$.

It was shown in [3], [4] that any sequence can be learned: for any
$\Delta > 0$, a universal randomized forecasting system $f$ was constructed
such that for any sequence $\omega = \omega_1 \omega_2 \dots$ the overall
probability $\Pr$ of the event

$$\left| \frac{1}{n} \sum_{i=1}^{n} I(p_i)(\omega_i - p_i) \right| \le \Delta \quad (1)$$

tends to one as $n \to \infty$, where $p_i = f(\omega^{i-1})$ is the random
variable and $I(p)$ is the characteristic function of an arbitrary subinterval
of $[0, 1]$; we call this function a forecast-based checking rule.

Lehrer [5] and Sandroni et al. [8] extended the class of checking rules to
combinations of forecast- and outcome-based checking rules: a checking rule is
a function $c(\omega^{i-1}, p) = \delta(\omega^{i-1}) I(p)$, where
$\delta : \Xi \to \{0, 1\}$ is an outcome-based checking rule and $I(p)$ is
the characteristic function of a subinterval of $[0, 1]$. They also considered
a more general class of randomized forecasting systems: random variables
$p_i = f(\alpha; \omega^{i-1}, p^{i-1})$, where
$p^{i-1} = p_1, \dots, p_{i-1}$ is the sequence of past realized forecasts.
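The quantity controlled by a combined checking rule can be computed directly
from a finite record of outcomes and realized forecasts. The sketch below is
illustrative only: `calibration_error`, `always`, `after_one` and `whole` are
hypothetical names, and the data are made up for the example.

```python
from typing import Callable, Sequence


def calibration_error(
    outcomes: Sequence[int],
    forecasts: Sequence[float],
    delta: Callable[[Sequence[int]], int],
    indicator: Callable[[float], int],
) -> float:
    """Return (1/n) * sum_i delta(w^{i-1}) * I(p_i) * (w_i - p_i)."""
    n = len(outcomes)
    total = 0.0
    for i in range(n):
        past = outcomes[:i]                    # the rule delta sees only w^{i-1}
        total += delta(past) * indicator(forecasts[i]) * (outcomes[i] - forecasts[i])
    return total / n


def always(past: Sequence[int]) -> int:
    """Trivial outcome-based rule: select every step."""
    return 1


def after_one(past: Sequence[int]) -> int:
    """Outcome-based rule: select a step iff the previous outcome was 1."""
    return 1 if past and past[-1] == 1 else 0


def whole(p: float) -> int:
    """Characteristic function of the whole interval [0, 1]."""
    return 1 if 0.0 <= p <= 1.0 else 0


outcomes = [1, 0, 1, 1, 0, 1, 0, 0]
forecasts = [0.5] * 8
err_all = calibration_error(outcomes, forecasts, always, whole)
err_after_one = calibration_error(outcomes, forecasts, after_one, whole)
```

The constant forecast 0.5 is perfectly calibrated on this record when every
step is selected, but the outcome-based rule `after_one` exposes a nonzero
error on the subsequence it selects.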

For $k = 1, 2, \dots$, let $\{\delta_k\}$ be any sequence of outcome-based
checking rules and $\{I_k\}$ be any sequence of characteristic functions of
subintervals of $[0, 1]$. Sandroni et al. [8] defined a randomized universal
forecasting system which calibrates all checking rules $\{\delta_k I_k\}$,
$k = 1, 2, \dots$, i.e., such that for any $\Delta > 0$ and for any sequence
$\omega = \omega_1 \omega_2 \dots$ the overall probability of the event (1)
tends to one as $n \to \infty$, where
$p_i = f(\alpha; \omega^{i-1}, p^{i-1})$ and $I(p_i)$ is replaced by
$\delta_k(\omega^{i-1}) I_k(p_i)$ for all $k = 1, 2, \dots$.

In this paper we consider the class of all computable (partial recursive)
outcome-based checking rules $\{\delta_k\}$ and a slightly different class of
randomized forecasting systems: our forecasting systems are random variables
$p_i = f(\alpha; \omega^{i-1})$ that do not depend on past realized forecasts
(this is the case for the universal forecasting systems defined in [3] and
[10]$^1$). At the same time, such a function may be undefined outside
$\omega$; we require only that any well-defined forecasting system be defined
on all initial fragments of the analyzed sequence of outcomes. This
peculiarity is important, since we consider forecasting systems possessing
some computational properties: there is an algorithm computing the probability
distribution function of such a forecasting system. This algorithm, when fed
some input, may never finish its work, and so is undefined on this input.

In this setting, a universal randomized forecasting algorithm which calibrates
all computationally efficient outcome-forecast-based checking rules does not
exist. Moreover, we construct a probabilistic generator (or probabilistic
algorithm) of sequences that are non-learnable in this sense. This generator
outputs with probability close to one an infinite sequence such that for each
randomized forecasting system $p_i = f(\alpha; \omega^{i-1})$ some computable
outcome-based checking rule $\delta$ selects an infinite subsequence of
$\omega$ on which the property (1) fails for some characteristic function $I$
with overall probability one, where the overall probability is associated with
the forecasting system $f$.

2 Miscalibrating the forecasts 

We use standard notions of the theory of algorithms. This theory is
systematically treated in, for example, Rogers [7]. We fix some effective
one-to-one enumeration of all pairs (triples, and so on) of nonnegative
integers. We identify any pair $(t, s)$ with its number $\langle t, s \rangle$;
let $p(\langle t, s \rangle) = t$.
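One concrete choice of such an enumeration is the Cantor pairing function. The
text does not fix a particular enumeration, so the sketch below is only an
assumption made for illustration; `pair`, `unpair` and `p` are hypothetical
names mirroring $\langle t, s \rangle$ and the projection $p$.

```python
from math import isqrt
from typing import Tuple


def pair(t: int, s: int) -> int:
    """Cantor pairing: a one-to-one numbering of pairs of nonnegative integers."""
    w = t + s
    return w * (w + 1) // 2 + s


def unpair(z: int) -> Tuple[int, int]:
    """Inverse of pair: recover (t, s) from the number z."""
    w = (isqrt(8 * z + 1) - 1) // 2   # index of the diagonal containing z
    s = z - w * (w + 1) // 2
    return w - s, s


def p(z: int) -> int:
    """Projection onto the first component: p(<t, s>) = t."""
    return unpair(z)[0]


# Every value t is returned by p for infinitely many arguments: <t, 0>, <t, 1>, ...
numbers_with_first_component_3 = [pair(3, s) for s in range(3)]
```

With this choice each $t$ satisfies $p(n) = t$ for infinitely many $n$, which
is the property of $p$ used later to revisit every program infinitely often.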

$^1$ Note that the algorithm from [8] can be modified in the fashion of [3],
i.e., so that at any step of the construction past forecasts can be replaced
by measures with finite supports defined at previous steps. Since these
measures are defined recursively in the process of the construction, they can
be eliminated from the condition of the universal forecasting algorithm.

A function $\phi : A \to \mathcal{R}$ is called (lower) semicomputable if
$\{(r, x) : r < \phi(x)\}$ ($r$ is a rational number) is a recursively
enumerable set. A function $\phi$ is upper semicomputable if $-\phi$ is lower
semicomputable. A standard argument based on recursion theory shows that there
exist lower and upper semicomputable real functions $\phi^-(j, x)$ and
$\phi^+(k, x)$ universal for all lower semicomputable and upper semicomputable
functions of $x \in \Xi$; in particular, every computable real function
$\phi(x)$ can be represented as $\phi(x) = \phi^-(j, x) = \phi^+(k, x)$ for
all $x$, for some $j$ and $k$. Let $\phi^-_s(j, x)$ be equal to the maximal
rational number $r$ such that the triple $(r, j, x)$ is enumerated in $s$
steps in the process of enumerating the set
$\{(r, j, x) : r < \phi^-(j, x),\ r \text{ is rational}\}$, and equal to
$-\infty$ otherwise. Any such function $\phi^-_s(j, x)$ takes only a finite
number of rational values distinct from $-\infty$. By definition,
$\phi^-_s(j, x) \le \phi^-_{s+1}(j, x)$ for all $j, s, x$, and
$\phi^-(j, x) = \lim_{s \to \infty} \phi^-_s(j, x)$. An analogous
non-increasing sequence of functions $\phi^+_s(k, x)$ exists for any upper
semicomputable function.

Let $i = \langle t, k \rangle$. We say that a real function $\phi_i(x)$ is
defined on $x$ if, given any degree of precision (a positive rational number
$\kappa > 0$), it holds that $|\phi^+_s(t, x) - \phi^-_s(k, x)| < \kappa$ for
some $s$; $\phi_i(x)$ is undefined otherwise. If any such $s$ exists then, for
the minimal such $s$, $\phi_{i,\kappa}(x) = \phi^-_s(k, x)$ is called the
rational approximation (from below) of $\phi_i(x)$ up to $\kappa$;
$\phi_{i,\kappa}(x)$ is undefined otherwise.
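The search for the minimal stage $s$ can be sketched as follows. This is an
illustration under assumptions, not the construction of the text: `lower` and
`upper` are hypothetical stand-ins for the stage-$s$ approximations, and the
cap `max_steps` is added only so that the example terminates, whereas the
search in the definition is unbounded and may diverge.

```python
from fractions import Fraction
from typing import Callable, Optional


def approx_from_below(
    lower: Callable[[int], Fraction],
    upper: Callable[[int], Fraction],
    kappa: Fraction,
    max_steps: int = 10_000,
) -> Optional[Fraction]:
    """Return lower(s) for the minimal s with upper(s) - lower(s) < kappa."""
    for s in range(1, max_steps + 1):
        if upper(s) - lower(s) < kappa:
            return lower(s)
    return None   # stands in for "undefined" (assumption: bounded search)


# Hypothetical example: dyadic approximations of the real number 1/3.
def lower(s: int) -> Fraction:
    return Fraction(2**s // 3, 2**s)          # non-decreasing in s


def upper(s: int) -> Fraction:
    return lower(s) + Fraction(1, 2**s)       # non-increasing in s


kappa = Fraction(1, 10)
value = approx_from_below(lower, upper, kappa)
```

The returned rational lies below the target and within $\kappa$ of it, which
is exactly what the notation $\phi_{i,\kappa}(x)$ requires.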

To define a measure $P$ on $\Omega$, we define values $P(z) = P(\Gamma_z)$ for
all intervals $\Gamma_z = \{\omega \in \Omega : z \sqsubseteq \omega\}$, where
$z \in \Xi$, and extend this function to all Borel subsets of $\Omega$ in a
standard way.

We also use the concept of a computable operation on $\Xi \cup \Omega$ (see
[12]). Let $\hat F$ be a recursively enumerable set of ordered pairs of finite
sequences satisfying the following properties: (i) $(x, \Lambda) \in \hat F$
for each $x$; (ii) if $(x, y) \in \hat F$, $(x', y') \in \hat F$ and
$x \sqsubseteq x'$ then $y \sqsubseteq y'$ or $y' \sqsubseteq y$ for all
finite binary sequences $x, x', y, y'$. A computable operation $F$ is defined
as follows:

$$F(\omega) = \sup\{y \mid x \sqsubseteq \omega \text{ and } (x, y) \in \hat F \text{ for some } x\},$$

where $\omega \in \Xi \cup \Omega$ and sup is in the sense of the partial
order $\sqsubseteq$ on $\Xi \cup \Omega$.
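On finite inputs the supremum in this definition is simply the longest output
attached to a prefix of the input. The sketch below illustrates this for a
finite fragment of a set of pairs; the bit-doubling map and all names are
hypothetical examples, and property (i) is represented only by the pair of
empty strings.

```python
from itertools import product
from typing import List, Tuple


def apply_operation(pairs: List[Tuple[str, str]], x: str) -> str:
    """Longest y such that (x', y) is in pairs for some prefix x' of x."""
    best = ""
    for x_prefix, y in pairs:
        if x.startswith(x_prefix) and len(y) > len(best):
            # Property (ii): outputs attached to comparable inputs are comparable.
            assert y.startswith(best)
            best = y
    return best


def double_bits(x: str) -> str:
    """Hypothetical monotone map: repeat every bit twice."""
    return "".join(b + b for b in x)


# Finite fragment of the set of pairs: all inputs of length at most 3.
pairs = [
    ("".join(bits), double_bits("".join(bits)))
    for n in range(4)
    for bits in product("01", repeat=n)
]

out_long = apply_operation(pairs, "0110")   # only prefixes of length <= 3 are listed
out_short = apply_operation(pairs, "1")
```

Feeding uniformly random bits into such an operation is what turns it into a
probabilistic algorithm in the sense defined next.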

A probabilistic algorithm is a pair $(L, F)$, where
$L(x) = L(\Gamma_x) = 2^{-l(x)}$ is the uniform measure on $\Omega$ and $F$ is
a computable operation. For any probabilistic algorithm $(L, F)$ and a set
$A \subseteq \Omega$, we consider the probability
$L\{\omega : F(\omega) \in A\}$ of generating by means of $F$ a sequence from
$A$ given a uniformly distributed sequence $\omega$.

A partial randomized forecasting system $f$ is weakly computable if its weak
probability distribution function
$\varphi_n(\omega^{n-1}) = \Pr_n\{f(\omega^{n-1}) < \frac{1}{2}\}$ is a
partial recursive function of $\omega^{n-1}$.

Any function $\delta : \Xi \to \{0, 1\}$ is called an outcome-based selection
(or checking) rule. For any sequence $\omega = \omega_1 \omega_2 \dots$, the
selection rule $\delta$ selects the sequence of indices $n_i$ such that
$\delta(\omega^{n_i - 1}) = 1$, $i = 1, 2, \dots$, and the corresponding
subsequence $\omega_{n_1} \omega_{n_2} \dots$ of $\omega$.

The following theorem is the main result of this paper. In particular, it
shows that the construction of the universal forecasting algorithm from
Sandroni et al. [8] is computationally non-efficient in the case when the
class of all partial recursive outcome-based checking rules $\{\delta_k\}$ is
used.



Theorem 1. For any $\epsilon > 0$ a probabilistic algorithm $(L, F)$ can be
constructed which with probability $\ge 1 - \epsilon$ outputs an infinite
binary sequence $\omega = \omega_1 \omega_2 \dots$ such that for every partial
weakly computable randomized forecasting system $f$ defined on all initial
fragments of the sequence $\omega$ there exists a computable selection rule
$\delta$ defined on all these fragments and such that for $\nu = 0$ or for
$\nu = 1$ the overall probability of the event

$$\limsup_{n \to \infty} \left| \frac{1}{n} \sum_{i=1}^{n} \delta(\omega^{i-1}) I_\nu(p_i)(\omega_i - p_i) \right| \ge 1/16 \quad (2)$$

equals one, where $I_0$ and $I_1$ are the characteristic functions of the
intervals $[0, \frac{1}{2})$ and $[\frac{1}{2}, 1]$, $p_i = f(\omega^{i-1})$
is a random variable, $i = 1, 2, \dots$, and the overall probability
distribution is associated with $f$.

Proof. For any probabilistic algorithm $(L, F)$, we consider the function

$$Q(x) = L\{\omega : x \sqsubseteq F(\omega)\}. \quad (3)$$

It is easy to verify that this function is lower semicomputable and satisfies
$Q(\Lambda) \le 1$ and $Q(x0) + Q(x1) \le Q(x)$ for all $x$. Any function
satisfying these properties is called a semicomputable semimeasure. For any
semicomputable semimeasure $Q$ a probabilistic algorithm $(L, F)$ exists such
that (3) holds. Though the semimeasure $Q$ is not a measure, we consider the
corresponding measure on the set $\Omega$:

$$\bar Q(\Gamma_x) = \inf_n \sum_{l(y) = n,\ x \sqsubseteq y} Q(y).$$

We will construct a semicomputable semimeasure $Q$ as a kind of network flow.
We define an infinite network on the base of the infinite binary tree. Any
$x \in \Xi$ defines two edges $(x, x0)$ and $(x, x1)$ of length one. In the
construction below we will mount to the network extra edges $(x, y)$ of length
$> 1$, where $x, y \in \Xi$, $x \sqsubseteq y$ and $y \ne x0, x1$. By the
length of the edge $(x, y)$ we mean the number $l(y) - l(x)$. For any edge
$\sigma = (x, y)$ we denote by $\sigma_1 = x$ its starting vertex and by
$\sigma_2 = y$ its terminal vertex. A computable function $q(\sigma)$ defined
on all edges of length one and on all extra edges and taking rational values
is called a network if for all $x \in \Xi$

$$\sum_{\sigma : \sigma_1 = x} q(\sigma) \le 1.$$

Let $G$ be the set of all extra edges of the network $q$ (it is a part of the
domain of $q$). By the $q$-flow we mean the minimal semimeasure $P$ such that
$P \ge R$, where the function $R$ is defined by the following recursive
equations: $R(\Lambda) = 1$ and

$$R(y) = \sum_{\sigma : \sigma_2 = y} q(\sigma) R(\sigma_1) \quad (4)$$

for $y \ne \Lambda$. A network $q$ is called elementary if the set of extra
edges is finite and $q(\sigma) = 1/2$ for almost all edges of unit length. For
any network $q$, we define the network flow delay function ($q$-delay
function)

$$d(x) = 1 - q(x, x0) - q(x, x1).$$
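The recursion for $R$ and the role of the delay function can be checked on a
small finite network. The sketch below is a toy example under assumptions: the
single delayed vertex, the single extra edge and all names are hypothetical
and are not the network built later in the proof.

```python
from fractions import Fraction
from itertools import product
from typing import Callable, Dict, Tuple


def flow_R(
    q_unit: Callable[[str, str], Fraction],
    q_extra: Dict[Tuple[str, str], Fraction],
    depth: int,
) -> Dict[str, Fraction]:
    """R(empty) = 1 and R(y) = sum over edges s ending in y of q(s) * R(s_1)."""
    R = {"": Fraction(1)}
    for n in range(1, depth + 1):
        for bits in product("01", repeat=n):
            y = "".join(bits)
            total = q_unit(y[:-1], y) * R[y[:-1]]      # the unit edge into y
            for (x, z), weight in q_extra.items():     # extra edges into y
                if z == y:
                    total += weight * R[x]
            R[y] = total
    return R


# Toy network: vertex "0" delays a quarter of its flow and sends it along
# the extra edge ("0", "011"); every other vertex has delay 0.
delay = {"0": Fraction(1, 4)}


def q_unit(x: str, y: str) -> Fraction:
    return (1 - delay.get(x, Fraction(0))) / 2


q_extra = {("0", "011"): delay["0"]}

R = flow_R(q_unit, q_extra, depth=3)
mass_at_2 = sum(v for y, v in R.items() if len(y) == 2)
mass_at_3 = sum(v for y, v in R.items() if len(y) == 3)
```

The delayed quarter of the flow through "0" is missing at length 2 and
reappears at the end of the extra edge at length 3, which is how extra edges
redirect mass without losing it.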



The construction below works with all computable real functions $\phi_t(x)$,
$x \in \Xi$, $t = 1, 2, \dots$. We suppose that for any computable function
$\phi$ there exist infinitely many programs $t$ such that
$\phi_t = \phi$.$^2$ Any pair $i = \langle t, s \rangle$ is considered as a
program for computing the rational approximation $\phi_{t,\kappa_s}$ of
$\phi_t$ from below up to $\kappa_s = 1/s$.

In the construction below we visit any function $\phi_t$ at infinitely many
steps $n$. To do this, we use the function $p(n)$: for any positive integer
$i$ we have $p(n) = i$ for infinitely many $n$.

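Let $\beta$ be a finite sequence and $1 \le k \le l(\beta)$. A bit $\beta_k$
of the sequence $\beta$ is called hardly predictable by a program
$i = \langle t, s \rangle$ if $\phi_{t,\kappa_s}(\beta^{k-1})$ is defined and

$$\beta_k = \begin{cases} 0 & \text{if } \phi_{t,\kappa_s}(\beta^{k-1}) \ge \frac{1}{2}, \\ 1 & \text{otherwise.} \end{cases}$$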



Lemma 1. Let $i = \langle t, s \rangle$ be a program and $\mu$ be an arbitrary
sufficiently small positive real number. Then for any binary sequence $x$ of
length $n$ the portion of all sequences $\gamma$ of length
$K = \lceil (2 + \mu) i \rceil n$ (in the set of all finite sequences of
length $K$) such that

1) $\phi_{t,\kappa_s}(x\gamma^k)$ is defined for all $0 \le k < K$,

2) the number of hardly predictable bits of $\gamma$ by the forecasting
program $i$ is less than $in$,

is $\le 2^{-2\mu^2 in + O(\log(in))}$ for all sufficiently large $n$.

Proof. Any function $\sigma(x)$, where $x \in \Xi$ and
$\sigma(x) \in \{A, B\}$, is called a labelling if
$\sigma(x0) \ne \sigma(x1)$ for all $x \in \Xi$. For any $\gamma$ of length
$K$ and for any $k$ such that $0 \le k < K$, define
$\sigma(x\gamma^{k+1}) = A$ and $\sigma(x\gamma^k \bar\gamma_{k+1}) = B$ if
the bit $\gamma_{k+1}$ of the sequence $x\gamma$ is hardly predictable, where
we denote $\bar\theta = 1 - \theta$ for any binary bit $\theta$. Since
$\phi_{t,\kappa_s}(x\gamma^k)$ is defined for all $0 \le k < K$,
$\sigma(x\gamma^{k+1})$ is also defined for all these $k$. This partial
labelling $\sigma$ can be easily extended to the set of all binary sequences
of length $K$ in many different ways. We fix some such extension. Then the
total number of all $\gamma$ satisfying 1)-2) does not exceed the total number
of all binary sequences of length $K$ with $< in$ labels $A$. Therefore, for
all sufficiently large $n$, the portion of these $\gamma$ does not exceed



where $H(r) = -r \log r - (1 - r) \log(1 - r)$. □

In the following we put $\mu = 1/\log(i + 1)$.
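The counting step in the proof of Lemma 1 uses the entropy function $H$. As a
numerical sanity check, the sketch below compares the exact fraction of binary
sequences with few labels $A$ against the standard entropy estimate
$\sum_{j \le rK} \binom{K}{j} \le 2^{H(r)K}$ for $r \le 1/2$. It is an
illustration only: the function names and the parameter values are
assumptions, and the estimate shown is the textbook one, not necessarily the
exact expression used in the lemma.

```python
from math import ceil, comb, log2


def binary_entropy(r: float) -> float:
    """H(r) = -r log r - (1 - r) log(1 - r), logarithms to base 2."""
    if r in (0.0, 1.0):
        return 0.0
    return -r * log2(r) - (1 - r) * log2(1 - r)


def portion_with_few_labels(K: int, m: int) -> float:
    """Exact fraction of binary sequences of length K with fewer than m labels A."""
    return sum(comb(K, j) for j in range(m)) / 2**K


def entropy_bound(K: int, m: int) -> float:
    """Textbook estimate 2^{-(1 - H(m/K)) K}, valid for m <= K/2."""
    return 2.0 ** (-(1.0 - binary_entropy(m / K)) * K)


# Hypothetical parameters: program index i, fragment length n, slack mu.
i, n, mu = 2, 10, 0.5
K = ceil((2 + mu) * i) * n   # K = ceil((2 + mu) i) n
m = i * n                    # threshold "fewer than i*n labels A"

exact = portion_with_few_labels(K, m)
bound = entropy_bound(K, m)
```

Because $m/K < 1/2$ whenever $\mu > 0$, the exponent $(1 - H(m/K))K$ is
positive and grows linearly in $n$, so the portion decays exponentially.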

We define an auxiliary relation $B(i, q^{n-1}, \sigma, n)$ and a function
$\beta(x, q^{n-1}, n)$. Let $x, \beta \in \Xi$. The value of
$B(i, q^{n-1}, (x, \beta), n)$ is true if the following conditions hold:

- $n \ge (1 + \lceil (2 + \log^{-1}(i + 1)) i \rceil) l(x)$;

- $l(\beta) = n$ and $x \sqsubseteq \beta$;

- $d^{n-1}(\beta^j) < 1$ for all $j$ such that $1 \le j < n$;

- for all $j$, $l(x) < j \le (1 + \lceil (2 + \log^{-1}(i + 1)) i \rceil) l(x)$,
the value $\phi_{t,\kappa_s}(\beta^{j-1})$ is computed in $\le n$ steps, and
for at least $i\,l(x)$ of these $j$ the bit $\beta_j$ is hardly predictable by
the program $i = \langle t, s \rangle$.

The value of $B(i, q^{n-1}, (x, \beta), n)$ is false otherwise. Define

$$\beta(x, q^{n-1}, n) = \min\{y : p(l(y)) = p(l(x)),\ B(p(l(x)), q^{n-1}, (x, y), n)\}.$$

Here min is taken with respect to the lexicographical ordering of strings; we
suppose that $\min \emptyset$ is undefined.

$^2$ To obtain this property, we can replace the sequence $\phi_t(x)$ by the
sequence $\phi'_{\langle t, s \rangle}(x) = \phi_t(x)$ for all $s$.

Construction. Let $\rho(n) = (n + n_0)^2$ for some sufficiently large $n_0$
(the value $n_0$ will be specified below in the proof of Lemma 5).

Using mathematical induction on $n$, we define a sequence $q^n$ of elementary
networks. Put $q^0(\sigma) = 1/2$ for all edges $\sigma$ of length one.

Let $n > 0$ and let a network $q^{n-1}$ be defined. Let $d^{n-1}$ be the
$q^{n-1}$-delay function and let $G^{n-1}$ be the set of all extra edges. We
suppose also that $l(\sigma_2) < n$ for all $\sigma \in G^{n-1}$.

Let us define a network $q^n$. First, we define a network flow delay function
$d^n$ and a set $G^n$. The construction can be split up into two cases.

Let $w(i, q^{n-1})$ be equal to the minimal $m$ such that $p(m) = i$ and
$m > l(\sigma_2)$ for each extra edge $\sigma \in G^{n-1}$ such that
$p(l(\sigma_1)) < i$.

The inequality $w(i, q^n) \ne w(i, q^{n-1})$ can be induced by some task
$j < i$ that mounts an extra edge $\sigma = (x, y)$ such that
$l(x) > w(i, q^{n-1})$ and $p(l(x)) = p(l(y)) = j$. Lemma 2 (below) will show
that this can happen only at finitely many steps of the construction.

Case 1. $w(p(n), q^{n-1}) = n$ (the goal of this part is to start a new task
$i = p(n)$ or to restart the existing task $i = p(n)$ if it was destroyed by
some task $j < i$ at some preceding step).

Put $d^n(y) = 1/\rho(n)$ for $l(y) = n$ and define $d^n(y) = d^{n-1}(y)$ for
all other $y$. Put also $G^n = G^{n-1}$.

Case 2. $w(p(n), q^{n-1}) < n$ (the goal of this part is to process the task
$i = p(n)$). Let $C_n$ be the set of all $x$ such that
$w(i, q^{n-1}) \le l(x) < n$, $0 < d^{n-1}(x) < 1$, the function
$\beta(x, q^{n-1}, n)$ is defined$^3$ and there is no extra edge
$\sigma \in G^{n-1}$ such that $\sigma_1 = x$.

In this case for each $x \in C_n$ define $d^n(\beta(x, q^{n-1}, n)) = 0$, and
for all other $y$ of length $n$ such that $x \sqsubseteq y$ define

$$d^n(y) = \frac{d^{n-1}(x)}{1 - d^{n-1}(x)}.$$

Define $d^n(y) = d^{n-1}(y)$ for all other $y$. We add extra edges to
$G^{n-1}$, namely, define

$$G^n = G^{n-1} \cup \{(x, \beta(x, q^{n-1}, n)) : x \in C_n\}.$$

$^3$ In particular, $p(l(x)) = i$ and $l(\beta(x, q^{n-1}, n)) = n$.



We say that the task $i = p(n)$ mounts the extra edge
$(x, \beta(x, q^{n-1}, n))$ to the network and that all existing tasks $j > i$
are destroyed by the task $i$.

After Case 1 and Case 2, define for any edge $\sigma$ of unit length

$$q^n(\sigma) = \tfrac{1}{2}(1 - d^n(\sigma_1))$$

and $q^n(\sigma) = d^n(\sigma_1)$ for each extra edge $\sigma \in G^n$.

Case 3. Cases 1 and 2 do not hold. Define $d^n = d^{n-1}$, $q^n = q^{n-1}$,
$G^n = G^{n-1}$.

As the result of the construction we define the network
$q = \lim_{n \to \infty} q^n$, the network flow delay function
$d = \lim_{n \to \infty} d^n$ and the set of extra edges $G = \cup_n G^n$.

The functions $q$ and $d$ are computable and the set $G$ is recursive by their
definitions. Let $Q$ denote the $q$-flow.

The following lemma shows that any task can mount new extra edges only at a
finite number of steps. Let $G(i)$ be the set of all extra edges mounted by
the task $i$, and let $w(i, q) = \lim_{n \to \infty} w(i, q^n)$.

Lemma 2. The set $G(i)$ is finite, $w(i, q)$ exists and $w(i, q) < \infty$ for
all $i$.

Proof. Note that if $G(j)$ is finite for all $j < i$, then
$w(i, q) < \infty$. Hence, we must prove that the set $G(i)$ is finite for any
$i$. Suppose the opposite. Let $i$ be the minimal number such that $G(i)$ is
infinite. By the choice of $i$ the sets $G(j)$ for all $j < i$ are finite.
Then $w(i, q) < \infty$.

For any $x$ such that $l(x) \ge w(i, q)$, consider the maximal $m$ such that
for some initial fragment $x^m \sqsubseteq x$ there exists an extra edge
$\sigma = (x^m, y) \in G(i)$. If no such extra edge exists, define
$m = w(i, q)$. By definition, if $d(x^m) \ne 0$ then $1/d(x^m)$ is an integer.
Define

$$u(x) = \begin{cases} 1/d(x^m) & \text{if } d(x^m) \ne 0,\ l(x) \ge w(i, q), \\ \rho(w(i, q)) & \text{if } l(x) < w(i, q), \\ 0 & \text{otherwise.} \end{cases}$$

By construction the integer-valued function $u(x)$ has the property
$u(x) \ge u(y)$ if $x \sqsubseteq y$. Besides, if $u(x) > u(y)$ then
$u(x) > u(z)$ for all $z$ such that $x \sqsubseteq z$ and $l(z) = l(y)$. Then
the function

$$u(\omega) = \min\{n : u(\omega^i) = u(\omega^n) \text{ for all } i \ge n\}$$

is defined for all $\omega \in \Omega$. It is easy to see that this function
is continuous. Since $\Omega$ is a compact space in the topology generated by
the intervals $\Gamma_x$, this function is bounded by some number $m$. Then
$u(x) = u(x^m)$ for all $l(x) \ge m$. By the construction, if any extra edge
of $i$th type was mounted to $G(i)$ at some step then $u(y) < u(x)$ holds for
some new pair $(x, y)$ such that $x \sqsubseteq y$. This contradicts the
existence of the number $m$. □

An infinite sequence $\alpha \in \Omega$ is called an $i$-extension of a
finite sequence $x$ if $x \sqsubseteq \alpha$ and
$B(i, q^{n-1}, (x, \alpha^n), n)$ is true for almost all $n$.

A sequence $\alpha \in \Omega$ is called $i$-closed if $d(\alpha^n) = 1$ for
some $n$ such that $p(n) = i$, where $d$ is the $q$-delay function. Note that
if $\sigma \in G(i)$ is some extra edge (i.e., an edge of $i$th type) then
$B(i, q^{n-1}, \sigma, n)$ is true, where $n = l(\sigma_2)$.



Lemma 3. Suppose that for any initial fragment $\omega^n$ of an infinite
sequence $\omega$ some $i$-extension exists. Then either the sequence $\omega$
will be $i$-closed in the process of the construction or $\omega$ contains an
extra edge of $i$th type (i.e., $\sigma_2 \sqsubseteq \omega$ for some
$\sigma \in G(i)$).

Proof. Suppose that the sequence $\omega$ is not $i$-closed. By Lemma 2 the
maximal $m$ exists such that $p(m) = i$ and $d(\omega^m) > 0$. Since the
sequence $\omega^m$ has an $i$-extension and $d(\omega^m) < 1$, by Case 2 of
the construction a new extra edge $(\omega^m, y)$ of $i$th type must be
mounted to the binary tree. By the construction $d(y) = 0$ and $d(z) \ne 0$
for all $z$ such that $\omega^m \sqsubseteq z$, $l(z) = l(y)$, and $z \ne y$.
By the choice of $m$ we have $y \sqsubseteq \omega$. □

Lemma 4. It holds that $Q(y) = 0$ if and only if $q(\sigma) = 0$ for some edge
$\sigma$ of unit length located on $y$ (this edge satisfies
$\sigma_2 \sqsubseteq y$).

Proof. The necessity of the condition is obvious. To prove that the condition
is sufficient, let us suppose that $q(y^n, y^{n+1}) = 0$ for some $n < l(y)$
but $Q(y) \ne 0$. Then by definition $d(y^n) = 1$. Since $Q(y) \ne 0$, an
extra edge $(x, z) \in G$ exists such that $x \sqsubseteq y^n$ and
$y^{n+1} \sqsubseteq z$. But, by the construction, this extra edge cannot be
mounted to the network since $d(z^n) = 1$. This contradiction proves the
lemma. □

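For any semimeasure $P$ define
$E_P = \{\omega \in \Omega : \forall n\,(P(\omega^n) \ne 0)\}$, the support
set of $P$. It is easy to see that $\bar P(E_P) = \bar P(\Omega)$. By Lemma 4,
$E_Q = \Omega \setminus \cup_{d(x) = 1} \Gamma_x$.

Lemma 5. It holds that $\bar Q(E_Q) \ge 1 - \frac{1}{2}\epsilon$.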

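Proof. We bound $\bar Q(\Omega)$ from below. Let $R$ be defined by (4). By the
definition of the network flow delay function, we have

$$\sum_{u : l(u) = n+1} R(u) = \sum_{u : l(u) = n} (1 - d(u)) R(u) + \sum_{\sigma : \sigma \in G,\ l(\sigma_2) = n+1} q(\sigma) R(\sigma_1). \quad (5)$$

Define an auxiliary sequence

$$S_n = \sum_{u : l(u) = n} R(u) - \sum_{\sigma : \sigma \in G,\ l(\sigma_2) = n} q(\sigma) R(\sigma_1).$$

First, we consider the case $w(p(n), q^{n-1}) < n$. If there is no edge
$\sigma \in G$ such that $l(\sigma_2) = n$ then $S_{n+1} \ge S_n$. Suppose
that some such edge exists. Define

$$P(u, \sigma) \iff l(u) = l(\sigma_2)\ \&\ \sigma_1 \sqsubseteq u\ \&\ u \ne \sigma_2\ \&\ \sigma \in G.$$

By the definition of the network flow delay function, we have

$$\sum_{u : l(u) = n} d(u) R(u) = \sum_{\sigma : \sigma \in G,\ l(\sigma_2) = n} \frac{d(\sigma_1)}{1 - d(\sigma_1)} \sum_{u : P(u, \sigma)} R(u) \le \sum_{\sigma : \sigma \in G,\ l(\sigma_2) = n} d(\sigma_1) R(\sigma_1) = \sum_{\sigma : \sigma \in G,\ l(\sigma_2) = n} q(\sigma) R(\sigma_1). \quad (6)$$

Here we used the inequality
$\sum_{u : P(u, \sigma)} R(u) \le R(\sigma_1) - d(\sigma_1) R(\sigma_1)$ for
all $\sigma \in G$ such that $l(\sigma_2) = n$. Combining this bound with (5)
we obtain $S_{n+1} \ge S_n$.

Let us consider the case $w(p(n), q^{n-1}) = n$. Then
$\sum_{u : l(u) = n} d(u) R(u) \le 1/\rho(n) = (n + n_0)^{-2}$. Combining (5)
and (6) we obtain $S_{n+1} \ge S_n - (n + n_0)^{-2}$ for all $n$. Since
$S_0 = 1$, this implies
$S_n \ge 1 - \sum_{i=1}^{\infty} (i + n_0)^{-2} \ge 1 - \frac{1}{2}\epsilon$
for some sufficiently large constant $n_0$. Since $Q \ge R$, it holds that

$$\bar Q(\Omega) = \inf_n \sum_{l(u) = n} Q(u) \ge \inf_n S_n \ge 1 - \tfrac{1}{2}\epsilon.$$

Lemma is proved. □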
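The choice of the constant $n_0$ in this proof can be made explicit. The
sketch below assumes that the total loss of flow is bounded by the tail sum
$\sum_{i \ge 1} (i + n_0)^{-2}$ and uses the integral estimate
$\sum_{i \ge 1} (i + n_0)^{-2} < 1/n_0$; the names `choose_n0` and `tail_sum`
are hypothetical, and the truncation length is an arbitrary choice for the
numerical check.

```python
from fractions import Fraction
from math import ceil


def choose_n0(eps: Fraction) -> int:
    """Smallest integer n0 with 1/n0 <= eps/2, so the tail sum is below eps/2."""
    return ceil(2 / eps)


def tail_sum(n0: int, terms: int = 100_000) -> float:
    """Truncated value of sum_{i >= 1} (i + n0)^(-2)."""
    return sum(1.0 / (i + n0) ** 2 for i in range(1, terms + 1))


eps = Fraction(1, 10)
n0 = choose_n0(eps)
loss = tail_sum(n0)
```

Any larger $n_0$ works as well; only the summability of the delays matters for
the bound.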

Lemma 6. There exists a set $U$ of infinite binary sequences such that
$\bar Q(U) \le \epsilon/2$ and for any sequence $\omega \in E_Q \setminus U$
for each partial computable forecasting system the condition (2) holds.

Proof. Let $\omega$ be an infinite sequence and let $f$ be a partial
computable forecasting system such that the corresponding $\phi_t(\omega^n)$
is defined for all $n$. Let $i = \langle t, s \rangle$ be a program for
computing the rational approximation $\phi_{t,\kappa_s}$ from below up to
$\kappa_s = 1/s$.

If $d(\omega^m) = 1$ for some $m$ such that $p(m) = i$ then for every $\beta$
of length $(1 + \lceil (2 + \log^{-1}(i + 1)) i \rceil) m$ such that
$\omega^m \sqsubseteq \beta$ there are $< im$ bits hardly predictable by the
forecasting program $i$.

We show that the $\bar Q$-measure of all intervals generated by such $\beta$
becomes arbitrarily small for all sufficiently large $i$. Since there are no
extra edges $\sigma$ such that $\omega^m \sqsubseteq \sigma_1$, the measure
$\bar Q$ when restricted to the interval $\Gamma_{\omega^m}$ is proportional
to the uniform measure. Then by Lemma 1, where $\mu = \log^{-1}(i + 1)$, the
$\bar Q$-measure of all such $\beta$ decreases exponentially in $im$.
Therefore, for each $j$ there exists a number $m_j$ such that
$\bar Q(U_j) \le 2^{-(j+1)}$, where $U_j$ is the union of all intervals
defined by all $\beta$ of length
$(1 + \lceil (2 + \log^{-1}(i + 1)) i \rceil) m$ for $m \ge m_j$ containing
$< im$ bits hardly predictable by the forecasting program $i = p(m)$. Define
$U = \cup_{j \ge k} U_j$, where $k = \lceil -\log_2 \epsilon - 1 \rceil$. We
have $\bar Q(U) \le \epsilon/2$.

Define a selection rule $\gamma$ as follows:

- define $\gamma(\omega^{j-1}) = 1$ if
$\sigma_1 \sqsubseteq \omega^{j-1} \sqsubset \sigma_2$ for some
$\sigma \in G(i)$ and the $j$th bit of $\sigma_2$ is hardly predictable by the
forecasting program $i$;

- define $\gamma(\omega^{j-1}) = 0$ otherwise.

We also define two selection rules $J_\nu$, where $\nu = 0, 1$,



Suppose that $\omega \in E_Q \setminus U$ and $\phi_t(\omega^n)$ is defined
for all $n$. Then $\omega$ is an $i$-extension of $\omega^n$ for each $n$.
Since for each $n$ the sequence $\omega^n$ is not $i$-closed, by Lemma 3 there
exists an extra edge $\sigma \in G(i)$ such that
$\sigma_2 \sqsubseteq \omega$. In the following, let $m = l(\sigma_1)$,
$n = (1 + \lceil (2 + \log^{-1}(i + 1)) i \rceil) m$.

Then by the construction the selection rule
$\delta_\nu(\omega^{j-1}) = \gamma(\omega^{j-1}) J_\nu(\omega^{j-1})$, for
$\nu = 0$ or for $\nu = 1$, selects from a fragment of $\omega$ of length $n$
a subsequence $\omega_{t_1}, \dots, \omega_{t_l}$ of length $l \ge im/2$.
Since by definition these bits are hardly predictable, we have
$\omega_{t_j} = 1$ for all $j$ such that $1 \le j \le l$ if $\nu = 0$, and
$\omega_{t_j} = 0$ for all these $j$ if $\nu = 1$.

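Let $p_j = f(\omega^{j-1})$, $j = 1, 2, \dots$, be an arbitrary computable
randomized forecasting system (it is a random variable) defined on all initial
fragments of $\omega = \omega_1 \omega_2 \dots$. Then
$\phi(\omega^{j-1}) = \Pr\{p_j \ge \frac{1}{2}\}$ is a computable real
function. By definition $\phi = \phi_t$ for infinitely many $t$ and

$$\phi_{t,\kappa_s}(\omega^{j-1}) \le \phi(\omega^{j-1}) \le \phi_{t,\kappa_s}(\omega^{j-1}) + \kappa_s \quad (7)$$

for all $s$ and $j$. Consider two random variables, for $\nu = 0$ and for
$\nu = 1$,

$$\vartheta_{n,\nu} = \sum_{j=1}^{n} \delta_\nu(\omega^{j-1}) I_\nu(p_j)(\omega_j - p_j).$$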

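Suppose that $l \ge im/2$ holds for $\nu = 0$. Then using (7) we obtain

$$E(\vartheta_{n,0}) \ge \sum_{j=m+1}^{n} \delta_0(\omega^{j-1}) \Pr\{p_j < \tfrac{1}{2}\} \cdot \tfrac{1}{2} - m \ge \frac{im}{4}\left(\tfrac{1}{2} - \kappa_s\right) - m. \quad (8)$$

Since $n = (1 + \lceil (2 + \log^{-1}(i + 1)) i \rceil) m$, $i$ can be
arbitrarily large and we visit any pair $i = \langle t, s \rangle$ infinitely
often, we obtain from (8)

$$\limsup_{n \to \infty} \frac{1}{n} E(\vartheta_{n,0}) \ge 1/16. \quad (9)$$

Analogously, if $\nu = 1$ we obtain

$$\liminf_{n \to \infty} \frac{1}{n} E(\vartheta_{n,1}) \le -1/16. \quad (10)$$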

The martingale strong law of large numbers says that for $\nu = 0, 1$ with
$\Pr$-probability one

$$\frac{1}{n} \sum_{j=1}^{n} \delta_\nu(\omega^{j-1}) I_\nu(p_j)(\omega_j - p_j) - \frac{1}{n} E(\vartheta_{n,\nu}) \to 0 \quad (11)$$

as $n \to \infty$. Combining (9), (10) and (11) we obtain (2).
Lemma 6 and Theorem 1 are proved. □

The following theorem is a generalization of the result from V'yugin [11] to
partially defined computable deterministic forecasting systems.

Theorem 2. For any $\epsilon > 0$ a probabilistic algorithm $(L, F)$ can be
constructed which with probability $\ge 1 - \epsilon$ outputs an infinite
binary sequence $\omega = \omega_1 \omega_2 \dots$ such that for every partial
deterministic forecasting algorithm $f$ defined on all initial fragments of
the sequence $\omega$ a computable outcome-based selection rule $\delta$
exists, defined on all these fragments, such that



The proof of this theorem is based on the same construction. 